Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence
Learning agents that can not only take tests but also innovate is becoming a
hot topic in AI. One of the most promising paths
towards this vision is multi-agent learning, where agents act as the
environment for each other, and improving each agent means proposing new
problems for others. However, existing evaluation platforms are either not
compatible with multi-agent settings, or limited to a specific game. That is,
there is not yet a general evaluation platform for research on multi-agent
intelligence. To this end, we introduce Arena, a general evaluation platform
for multi-agent intelligence with 35 games of diverse logics and
representations. Furthermore, multi-agent intelligence is still at the stage
where many problems remain unexplored. Therefore, we provide a building toolkit
for researchers to easily invent and build novel multi-agent problems from the
provided game set based on a GUI-configurable social tree and five basic
multi-agent reward schemes. Finally, we provide Python implementations of five
state-of-the-art deep multi-agent reinforcement learning baselines. Along with
the baseline implementations, we release a set of 100 best agents/teams,
trained with different training schemes for each game, as a basis for
evaluating agents by population performance. As such, the research community
can perform comparisons under a stable and uniform standard. All the
implementations and accompanying tutorials have been open-sourced for the
community at https://sites.google.com/view/arena-unity/
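
To make the "social tree plus reward scheme" idea more concrete, below is a toy Python sketch of how per-agent scores might be propagated through such a tree to produce training rewards. The scheme names ("collaborative", "competitive", "isolated") and the propagation rules are assumptions chosen for illustration, not Arena's actual implementation or API.

```python
# Toy sketch: propagate raw per-agent scores through a social tree whose
# internal nodes carry a reward scheme. Scheme names and rules are
# illustrative assumptions, not Arena's real interface.

def propagate(node, raw):
    """Return {agent_id: reward} for the subtree rooted at `node`.

    `node` is either an agent id (leaf) or a dict of the form
    {"scheme": "collaborative" | "competitive" | "isolated", "children": [...]}.
    `raw` maps agent ids to their raw per-agent scores.
    """
    if not isinstance(node, dict):                 # leaf: a single agent
        return {node: raw[node]}

    branches = [propagate(child, raw) for child in node["children"]]

    if node["scheme"] == "collaborative":
        # Everyone in the subtree shares the subtree's mean reward.
        merged = {k: v for b in branches for k, v in b.items()}
        mean = sum(merged.values()) / len(merged)
        return {k: mean for k in merged}

    if node["scheme"] == "competitive":
        # Zero-sum across sibling branches: each branch's reward is its own
        # total minus the average total of the other branches.
        totals = [sum(b.values()) for b in branches]
        out = {}
        for b, t in zip(branches, totals):
            others = (sum(totals) - t) / max(len(totals) - 1, 1)
            out.update({k: t - others for k in b})
        return out

    # Default ("isolated"): leave each agent's reward untouched.
    return {k: v for b in branches for k, v in b.items()}


# Example: two teams of two agents, collaborative within a team,
# competitive between teams.
tree = {"scheme": "competitive", "children": [
    {"scheme": "collaborative", "children": ["a1", "a2"]},
    {"scheme": "collaborative", "children": ["b1", "b2"]},
]}
print(propagate(tree, {"a1": 1.0, "a2": 3.0, "b1": 0.0, "b2": 2.0}))
# -> {'a1': 2.0, 'a2': 2.0, 'b1': -2.0, 'b2': -2.0} (zero-sum across teams)
```
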
The Costly Dilemma: Generalization, Evaluation and Cost-Optimal Deployment of Large Language Models
When deploying machine learning models in production for any
product/application, there are three properties that are commonly desired.
First, the models should be generalizable, so that we can extend them to further
use cases as our knowledge of the domain develops. Second, they should be
evaluable, so that there are clear metrics for performance and computing
those metrics in production settings is feasible. Finally, the deployment
should be as cost-optimal as possible. In this paper we propose that these
three objectives (i.e. generalization, evaluation and cost-optimality) can
often be relatively orthogonal and that for large language models, despite
their superior performance over conventional NLP models, enterprises need to
carefully assess all three factors before making substantial investments in this
technology. We propose a framework for generalization, evaluation and
cost-modeling specifically tailored to large language models, offering insights
into the intricacies of development, deployment and management of these models.
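
As a rough illustration of the kind of cost-modeling such a framework has to capture, the sketch below estimates monthly spend for token-priced LLM inference. The prices, request volumes, and token counts are made-up placeholders, not figures from the paper.

```python
# Back-of-the-envelope LLM inference cost model. All numbers below are
# hypothetical placeholders; substitute real per-token prices and traffic
# estimates for an actual analysis.

def monthly_inference_cost(requests_per_month,
                           avg_input_tokens,
                           avg_output_tokens,
                           price_per_1k_input,
                           price_per_1k_output):
    """Estimated monthly spend for a token-priced hosted LLM."""
    input_cost = requests_per_month * avg_input_tokens / 1000 * price_per_1k_input
    output_cost = requests_per_month * avg_output_tokens / 1000 * price_per_1k_output
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 2M requests/month, 500 input + 150 output tokens each.
    cost = monthly_inference_cost(
        requests_per_month=2_000_000,
        avg_input_tokens=500,
        avg_output_tokens=150,
        price_per_1k_input=0.0005,    # $ per 1k input tokens (placeholder)
        price_per_1k_output=0.0015,   # $ per 1k output tokens (placeholder)
    )
    print(f"Estimated monthly inference cost: ${cost:,.0f}")
```

A model like this makes the paper's "cost-optimality" axis concrete: changing the average prompt length, caching repeated queries, or moving to a smaller fine-tuned model all show up directly as changes to these inputs.
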